Loading in our Libraries
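The chunk that loads the libraries is not shown here. Judging only by the functions called later in this lesson, a setup along these lines should cover everything. The package list is an assumption inferred from those calls, not the original chunk.

```r
# Assumed package list, inferred from the functions used later in this lesson
library(tidyverse)      # read_csv(), %>%, select(), left_join(), ggplot()
library(here)           # here()
library(janitor)        # clean_names()
library(plotly)         # ggplotly()
library(rnaturalearth)  # ne_countries()
library(sf)             # st_as_sf(), st_transform()
library(leaflet)        # leaflet(), addTiles(), addPolygons(), colorQuantile()
```

Note that ne_countries() with scale = "medium" also needs the rnaturalearthdata package to be installed, even though it does not have to be loaded.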
Read in 2023 Sustainable Development Data with read_csv() and here()
sdr_data <- read_csv(here("data/SDR-2023-Data.csv"))
Clean column names
sdr_data <- sdr_data %>%
  clean_names()
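To see what clean_names() does, here is a minimal sketch on a toy data frame. The data frame and its values are made up for illustration. Only the header style is meant to resemble a spreadsheet export.

```r
library(janitor)

# Hypothetical data frame with spreadsheet-style headers (not the SDR file)
messy <- data.frame(
  "Country Code ISO3" = c("AAA", "BBB"),
  "Goal 4 Score"      = c(90.1, 55.3),
  check.names = FALSE   # keep the spaces and capitals in the column names
)

# clean_names() converts the headers to lowercase snake_case
tidy <- clean_names(messy)

names(messy)  # original headers, with spaces and capitals
names(tidy)   # cleaned headers, safe to type without backticks
```

Cleaned names are easier to use inside aes() and select() because they need no quoting.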
Let’s start with the histogram you created last lesson with ggplot() and geom_histogram()
ggplot(sdr_data, aes(x = goal_4_score, fill = regions_used_for_the_sdr)) +
  geom_histogram()
Looks nice, but let’s improve it!
We start with the exact same code and add to it with +
ggplot(sdr_data, aes(x = goal_4_score, fill = regions_used_for_the_sdr)) +
  geom_histogram() +
  theme_minimal() +
  scale_fill_viridis_d() +
  labs(title = "Distributions of SDG 4 Scores",
       x = "SDG 4 Score",
       y = "Number of Countries",
       fill = "Region")
Awesome, looking much better!
Interactive visualizations are a really exciting part of data science! Interactivity engages the viewer by allowing them to explore the data in a way they cannot with static visualizations
The great part is that with the ggplotly() function from the plotly package, making interactive visualizations is very simple
First we create the plot with the exact same code
The only difference is that we assign the plot a name with <-, the same way we do with dataframes or lists
Next we pass the name we gave the plot to the ggplotly() function
goal_4_histogram <- ggplot(sdr_data, aes(x = goal_4_score, fill = regions_used_for_the_sdr)) +
  geom_histogram() +
  theme_minimal() +
  scale_fill_viridis_d() +
  labs(title = "Distributions of SDG 4 Scores",
       x = "SDG 4 Score",
       y = "Number of Countries",
       fill = "Region")
ggplotly(goal_4_histogram)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 27 rows containing non-finite values (`stat_bin()`).
Epic! Now we can hover over the histogram and get some info
We can also double-click the boxes in the legend to only view the regions we’re interested in
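The two messages printed under the ggplotly() call are informational. The first says geom_histogram() fell back to its default of 30 bins, and the second says rows with missing scores were dropped. Here is a minimal sketch of setting both explicitly. The data is a made-up toy data frame, and the bin width of 5 is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical scores, including two missing values
toy <- data.frame(goal_4_score = c(42, 55, 61, 67, 73, 78, 84, 91, 97, NA, NA))

p <- ggplot(toy, aes(x = goal_4_score)) +
  geom_histogram(
    binwidth = 5,   # each bar spans 5 score points instead of using 30 bins
    na.rm = TRUE    # drop missing scores without printing a warning
  )

# The bar heights add up to the number of non-missing rows
bar_counts <- layer_data(p)$count
sum(bar_counts)
```

The same two arguments can be added to the geom_histogram() call in goal_4_histogram before passing it to ggplotly().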
leaflet
This takes a few fun steps to do
First we need the shape (geometry) of each country so it can be drawn on a map
Lucky for us, the rnaturalearth package has this information
world <- ne_countries(scale = "medium", returnclass = "sf")
- Now we have a dataframe named *world* in our environment
- It has 241 locations/countries (rows) and 64 columns describing each location/country
- We are only interested in 3 columns: name_long, iso_a3, and geometry
world <- world %>%
  select(name_long, iso_a3, geometry)
# Rename country_code_iso3 to iso_a3 so it matches the key column in world
sdr_data <- sdr_data %>%
  rename(iso_a3 = country_code_iso3)
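# Keep every row of sdr_data and attach the matching country shape from world
joined_df <- left_join(sdr_data, world, by = "iso_a3")
# Convert the joined table into an sf object so the geometry column is recognized
world_df_joined <- st_as_sf(joined_df)
# Put the coordinates in longitude/latitude on the WGS84 datum, which leaflet expects
world_df_joined <- st_transform(world_df_joined, "+proj=longlat +datum=WGS84")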
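A left join keeps every row of the first table, so any row whose iso_a3 code has no match in the second table ends up without a shape and will not appear on the map. Here is a minimal sketch of checking for such rows with anti_join(). The country names, codes, and scores are invented for illustration.

```r
library(dplyr)

# Hypothetical score table: two countries plus one regional aggregate row
sdr_toy <- tibble(
  country      = c("Freedonia", "Sylvania", "Eastern Region"),
  iso_a3       = c("FRD", "SYL", "_EAST"),
  goal_7_score = c(88.2, 61.5, 70.0)
)

# Hypothetical shape table (geometry column left out to keep the sketch small)
world_toy <- tibble(
  name_long = c("Freedonia", "Sylvania", "Cascadia"),
  iso_a3    = c("FRD", "SYL", "CSC")
)

# left_join() keeps all three score rows; the unmatched one gets NA for name_long
joined_toy <- left_join(sdr_toy, world_toy, by = "iso_a3")

# anti_join() returns only the score rows that found no partner
unmatched <- anti_join(sdr_toy, world_toy, by = "iso_a3")
unmatched
```

Running the same anti_join() on sdr_data and world is a quick way to spot codes that differ between the two sources. Some Natural Earth releases, for example, store a placeholder such as -99 in iso_a3 for a few countries.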
mytext <- paste(
  "Country: ", world_df_joined$country, "<br/>",
  "Goal 7 Score: ", round(world_df_joined$goal_7_score, 2),
  sep = "") %>%
  lapply(htmltools::HTML)
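The label code above builds one string per country and then marks each string as HTML, so the line break tag is rendered as a line break rather than shown as text. Here is a minimal sketch of the same pattern on two made-up countries.

```r
library(htmltools)

# Hypothetical inputs standing in for the country and goal_7_score columns
countries <- c("Freedonia", "Sylvania")
scores    <- c(88.234, 61.5)

# paste() is vectorised, so this produces one label string per country
label_strings <- paste(
  "Country: ", countries, "<br/>",
  "Goal 7 Score: ", round(scores, 2),
  sep = ""
)

# HTML() tags each string so it is treated as markup instead of plain text
labels <- lapply(label_strings, HTML)

labels[[1]]
```

The result is a list with one element per row, which is the shape the label argument of addPolygons() is given in this lesson.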
leaflet(world_df_joined) %>%
  addTiles() %>%
  setView(lat = 10, lng = 0, zoom = 2) %>%
  addPolygons(
    stroke = FALSE, fillOpacity = 0.5, smoothFactor = 0.5,
    color = ~colorQuantile("YlOrRd", goal_7_score)(goal_7_score),
    label = mytext
  )
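The color argument above packs two steps into one expression. colorQuantile() first builds a palette function from the scores, and that function is then called on the same scores to get one colour per country. Here is a minimal sketch that separates the two steps, using made-up scores.

```r
library(leaflet)

# Hypothetical scores, including one missing value
scores <- c(12, 35, 48, 60, 71, 83, 95, NA)

# Step 1: build the palette function; n = 4 (the default) splits the scores into quartiles
pal <- colorQuantile("YlOrRd", domain = scores, n = 4)

# Step 2: apply it to get one hex colour per score
# (missing values receive the palette's na.color, a grey by default)
colours <- pal(scores)
colours
```

Storing the palette function in a name such as pal also makes it possible to reuse it elsewhere, for instance in addLegend(pal = pal, values = ~goal_7_score), so the map and its legend stay consistent.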